QTL Verification and Candidate Gene Screening of Fiber Quality and Lint Percentage in the Secondary Segregating Population of Gossypium hirsutum

Fiber quality traits, especially fiber strength, length, and micronaire (FS, FL, and FM), have been recognized as critical fiber attributes in the textile industry, while the lint percentage (LP) was an important indicator to evaluate the cotton lint yield. So far, the genetic mechanism behind the formation of these traits is still unclear. Quantitative trait loci (QTL) identification and candidate gene validation provide an effective methodology to uncover the genetic and molecular basis of FL, FS, FM, and LP. A previous study identified three important QTL/QTL cluster loci, harboring at least one of the above traits on chromosomes A01, A07, and D12 via a recombinant inbred line (RIL) population derived from a cross of Lumianyan28 (L28) × Xinluzao24 (X24). A secondary segregating population (F2) was developed from a cross between L28 and an RIL, RIL40 (L28 × RIL40). Based on the population, genetic linkage maps of the previous QTL cluster intervals on A01 (6.70–10.15 Mb), A07 (85.48–93.43 Mb), and D12 (0.40–1.43 Mb) were constructed, which span 12.25, 15.90, and 5.56 cM, with 2, 14, and 4 simple sequence repeat (SSR) and insertion/deletion (Indel) markers, respectively. QTLs of FL, FS, FM, and LP on these three intervals were verified by composite interval mapping (CIM) using WinQTL Cartographer 2.5 software via phenotyping of F2 and its derived F2:3 populations. The results validated the previous primary QTL identification of FL, FS, FM, and LP. Analysis of the RNA-seq data of the developing fibers of L28 and RIL40 at 10, 20, and 30 days post anthesis (DPA) identified seven differentially expressed genes (DEGs) as potential candidate genes. qRT-PCR verified that five of them were consistent with the RNA-seq result. These genes may be involved in regulating fiber development, leading to the formation of FL, FS, FM, and LP. This study provides an experimental foundation for further exploration of these functional genes to dissect the genetic mechanism of cotton fiber development.


Introduction
Cotton fiber is an important raw material in industrial production [1].Owing to its high yield and wide adaptability, allotetraploid upland cotton is widely planted around the world, accounting for over 95% of the world's total cotton planting areas.However, its fiber quality is less attractive than sea-island cotton [1,2].With the continuous upgrading and innovation of textile technology, the requirements for the quality and quantity of cotton fibers are constantly increasing [3].Cotton fiber quality is mainly composed of fiber length (FL), fiber strength (FS), fiber micronaire (FM), fiber uniformity (FU), and fiber elongation (FE) [4].The lint percentage (LP) is a main indicator for evaluating cotton fiber yield [5].The cost-effectiveness of raw cotton fiber production requires high fiber quality that meets the technical requirements of the textile industry and a high yield potential that reimburses the agricultural costs.Therefore, effectively improving the fiber quality and yield of upland cotton remains one of the main goals of current cotton breeding projects [6].
Fiber quality and yield traits are complex traits controlled by multiple minor quantitative trait loci (QTLs)/genes and are easily affected by environments [7,8].Conventional breeding methods have made great contributions to the improvement of both fiber quality and yield.However, the limitations in the further simultaneous improvement of the two are becoming increasingly prominent due to the negative correlations between them, which become obvious under the current high-level breeding conditions [9].QTL mapping provides a powerful approach for cotton breeders to improve the fiber quality and yield traits of upland cotton via marker-assisted selection (MAS) [9,10].With the continuous improvement in the reference genome and the large-scale sequencing of a large number of bi-parental segregating and natural populations, a large number of QTLs and QTNs of fiber quality, yield traits, and biotic stresses, etc., have been identified and associated, providing important resources for cotton breeding programs via MAS [4,10,11].However, the practical value of these QTLs in MAS lies in their ability to pyramid the favorable alleles into newly developed cultivars.Furthermore, a large number of QTLs still need to be validated or fine-mapped for practical MAS applications or functional gene studies [12][13][14].With the improvement in genome and genome annotation information and the application of transcriptome technology, researchers will be able to identify candidate genes based on the annotation information and transcriptional expression profiles of genes within the QTL interval and explore gene functional research [3,12].Fang et al. [15] constructed an F 2 population (CCRI35 × Yumian1) containing 2484 individual plants to fine map an FS QTL, qFS07.1, to a DNA fragment of 62.6 kb (0.17 cM) containing four annotated genes.Through qRT-PCR and sequence comparison analysis, a leucine-rich repetitive protein kinase (LRR RLK) family protein-coding gene, Gh_A07G1749, was identified as its candidate gene.Islam et al. [3] used 27 SNP markers to finely map multiple QTLs, namely qFBS-c3, qSFI-c14, qUHML-c14, and qUHML-c24, in an intraspecific F 3 population (MD90ne × MD52ne) of upland cotton and accurately mapped them to the physical intervals of 4.4 Mb, 1.8 Mb, and 3.7 Mb in the reference genome.The receptor kinase pathway gene was identified as a candidate gene responsible for FS and FL based on the differential expression profiles and the amino acid mutation analysis of the gene between the two parents.Zang et al. [12] constructed four F 2 populations containing 1864 individuals through backcrossing four recombinant inbred lines (RILs) derived from the cross of Prema×86-1, RIL43, RIL98, RIL120, and RIL168, with 86-1, respectively.qFS-D3-1 was finely mapped to a fragment of 0.93 Mb (1.14 cM), which contained 23 annotated genes.Based on the transcriptional expression profiles of these genes during fiber development and gene sequencing, an allele with a 6 bp (GCCTCC) deletion of GhUBX (GH_D03G0985) gene was identified to be responsible for higher FS in Prema.GhUBX regulates fiber helix growth by degrading GhSPL1 in fiber cells through the ubiquitin 26S-proteasome pathway, leading to an increase in the number of cotton fiber helices and eventually improving fiber strength.Liu et al. [14] finely mapped qSI-A07-1 to 17.45 kb via an F 2 population and identified an allele with a deletion of 845 bp in the intron region of its candidate gene GH_A07G2179 (GhSI7; transcriptional regulator STERILE APETALA) responsible for regulating seed size.However, how these candidate genes altogether regulate cotton fiber quality development yield formation still keeps elusive.In the current climate changing conditions, research on the genetic regulation mechanisms of fiber yield and quality formation has become increasingly important, which may impose a great impact on the future development of both cotton cultivation and textile industry.
In a previous study, three QTL clusters were identified via a RIL population derived from an intraspecific upland cotton cross of Lumianyan28 and Xinluzao24 (L28 × X24) [16].The clusters mainly consisted of major-effect QTLs for their target traits of FS, FL, FM, and LP (Table 1).These loci may have potential implications for future variety improvement and for dissecting the formation mechanism of target traits.In this study, a secondary segregating F 2 population was constructed via a cross of L28 and RIL40, which was an excellent fiber quality RIL-derived L28 × X24.An F 2:3 population derived from F 2 was also constructed.Both F 2 and F 2:3 populations were applied to verify the genetic effect of the QTLs in the above three clusters, and the potential candidate genes were identified via analysis of the differentially expressed genes (DEGs) within the QTL interval based on an RNA-seq strategy.The results revealed that these QTLs will be of great significance in future breeding projects, and further dissection of the candidate genes will be beneficial to understanding their acting mechanism in cotton fiber development.
Table 1.Basic information of the QTL clusters in the primary QTL analysis in a previous study [16].Basic phenotypic statistics describing the FL, FS, FM, and LP of the parental lines, and the F 2 and F 2:3 populations are presented in Table 2.The RIL40 had higher FL and FS phenotypic values and lower FM and LP phenotypic values than those of L28, which were the same as those observed in the previous study (Table 2) [16].The t test revealed that the differences in these traits between RIL40 and L28 reached at least significance at p ≤ 0.05 (Table 2).The skewness and kurtosis evaluations showed that the phenotypes of all the target traits fit a normal distribution in both the F 2 and F 2:3 populations and that the F 2:3 phenotypic values could represent the variations and distributions of those of F 2 (Figure 1).Correlation analysis revealed that the phenotypic performances of these target traits were significantly positively correlated between the F 2:3 and F 2 generations.However, within each generation, the trait pairs of FL-FS and FM-LP were significantly positively correlated, while those of FL-FM/LP and FS-FM/LP were significantly negatively correlated in both the F 2:3 and F 2 (except FM-FS in F 2 and LP-FS in F 2:3 ) (Table 3).** indicate the correlation significances between different traits at the 0.01 levels.

Linkage Map Construction and QTL Mapping of the Target Loci
All polymorphic markers were used to genotype the 1961 individual plants of the F2 population (Table S1).Three genetic linkage groups were finally constructed for the target loci on chromosomes A01, A07, and D12 (Table 1), each of which contained 2, 14, and 4 molecular markers, spanning 12.25, 15.90, and 5.56 cM, respectively (Figure 2).The linkage groups of the previous study, the physical positions of the markers, and the linkage groups of the current study are presented in Figure 2a-c.The results revealed that the linkage groups of the current study were consistent with those of the previous study.

Linkage Map Construction and QTL Mapping of the Target Loci
All polymorphic markers were used to genotype the 1961 individual plants of the F 2 population (Table S1).Three genetic linkage groups were finally constructed for the target loci on chromosomes A01, A07, and D12 (Table 1), each of which contained 2, 14, and 4 molecular markers, spanning 12.25, 15.90, and 5.56 cM, respectively (Figure 2).The linkage groups of the previous study, the physical positions of the markers, and the linkage groups of the current study are presented in Figure 2a-c.The results revealed that the linkage groups of the current study were consistent with those of the previous study.
Namely, in the linkage group on A01, an FS QTL qFS-A01-1 was identified, which spanned a physical interval of 0.27 Mb (7.73-8.00Mb in the physical map).In the linkage group on A07, four QTLs, including qFL-A07-1, qFS-A07-1, qFM-A07-1, and qLP-A07-1 for each trait, respectively, were identified, which spanned a physical interval of 2.24 Mb (88.95-91.29 Mb in the physical map).In the linkage group on D12, an FL QTL qFL-D12-1 was identified, which spanned a physical interval of 0.11 Mb (0.48-0.59 Mb in the physical map) (Table 4, Figure 2c).QTL analysis of these three linkage groups on chromosomes A01, A07, and D12 of 1961 F 2 individuals and 356 F 2:3 lines revealed that the QTLs identified in the linkage groups were consistent with the ones identified at the same loci in the previous study.Namely, in the linkage group on A01, an FS QTL qFS-A01-1 was identified, which spanned a physical interval of 0.27 Mb (7.73-8.00Mb in the physical map).In the linkage group on A07, four QTLs, including qFL-A07-1, qFS-A07-1, qFM-A07-1, and qLP-A07-1 for each trait, respectively, were identified, which spanned a physical interval of 2.24 Mb (88.95-91.29 Mb in the physical map).In the linkage group on D12, an FL QTL qFL-D12-1 was identified, which spanned a physical interval of 0.11 Mb (0.48-0.59 Mb in the physical map) (Table 4, Figure 2c).In the QTL intervals, a total of seven DEGs, including one, five, and one on A01, A07, and D12, respectively, were identified, which were differentially expressed between the parental lines, L28 and RIL40, in the corresponding stages of fiber development via RNA-seq analysis (Table 5).In these DEGs, two, GH_A07G2180 and GH_A07G2209, were highly expressed at 10 DPA during fiber development, and their expression gradually decreased after 20 DPA; one gene, GH_A07G2203, was highly expressed at 20 DPA during fiber development; three genes, GH_A07G2222, GH_A07G2247, and GH_D12G0031, were highly expressed at 30 DPA during fiber development; one gene, GH_A01G0633, was similarly expressed at 10, 20, and 30 DPA during fiber development (Figure 2d).Genes in the QTL intervals of A01, A07, and D12 were screened via their FPKM values in developing fibers at 10, 15, 20, and 25 DPA of TM-1, which were fetched from the cotton functional genomic database (CottonFGD: https://cottonfgd.net/,accessed on 30 October 2022).The genes that had a mean FPKM value >0.5 were regarded as expression genes.A total of 79 genes from the interval on the A01 chromosome, 133 genes on the A07 chromosome, and 49 genes on the D12 chromosome were identified to have a dynamic expression during fiber development, forming six, six, and four distinct expression clusters, respectively (Figure 3).The results revealed that the DEG GH_A01G0633 was identified in expression cluster 1 of the interval on chromosome A01, which exhibited a steadily decreasing expression trend (Figure 3).The DEGs GH_A07G2180, GH_A07G2203, GH_A07G2209, GH_A07G2222, and GH_A07G2247 were identified in expression cluster 4, 6, 2, 5, and 1 on chromosome A07, respectively (Figure 3).The DEG GH_D12G0031 was identified in expression cluster 2 of the interval on chromosome D12, which showed a high expression at 15 DPA and then went down in slightly different styles (Figure 3).These results indicated the DEGs' involvement in the fiber development of cotton plants.qRT-PCR verification using fiber samples of L28 and X24 at 5, 10, 15, 20, 25, and 30 DPA confirmed the differential expression of the seven DEGs between L28 and X24.It also revealed that five DEGs, GH_A01G0633, GH_A07G2180, GH_A07G2222, GH_A07G2247, and GH_D12G0031, were consistent with the results of the RNA-seq analysis, while two DEGs, GH_A07G2203 and GH_A07G2209, were inconsistent (Figure 4).The consistent results of the RNA-seq, TM-1 gene expression data analysis, and qRT-PCR indicated the important roles of these five DEGs in cotton fiber development.

Plant Materials and Phenotypic Measurement
In a previous study, a G. hirsutum RIL population was established using a cross between two upland cultivars Lumianyan28 (L28) and Xinluzao24 (X24), L28 × X24 [16].L28 is a conventional cotton cultivar showing high yield potential, while X24 is a cultivar possessing high-quality fibers.The QTLs of the fiber quality and yield traits were identified, of which three cluster loci were selected for further analysis in the current study (Table 1).For this purpose, RIL40, a line of the L28 × X24 RIL population [16], which had the favorable alleles from X24, was selected to cross L28 to develop a secondary F 2 population.The developmental procedures were as follows: In the 2017 winter growing season, a cross of L28 × RIL40 was made at the experimental station of the Institute of Cotton Research in Sanya (Hainan province, China), the F 1 seeds were harvested and then planted in Anyang (Henan province, China) in the 2018 summer growing season.The F 1 plants were self-pollinated to obtain F 2 seeds in the 2018 summer growing season in Anyang.The F 2 population and parental lines, L28, RIL40, and X24, were planted in the 2019 summer growing season in Anyang.The F 2 were harvested per plant to form F 2:3 seeds, and an F 2:3 population was planted in two replications in a completely randomized block design in the 2020 summer growing season in Anyang.All naturally opened bolls were hand harvested per plant from the F 2 plants, and per line from the F 2:3 and the parental lines, respectively.The seed cotton of each sample was weighed and then ginned.The LP of each sample was evaluated, and the fiber quality traits were evaluated using the HVI (High Volume Instrument) system at the Institute of Cotton, Hebei Academy of Agriculture and Forestry Sciences (Shijiazhuang, China).The fiber quality traits include the FL (mm), FS (cN/tex), and FM.The descriptive statistics and correlation analysis of phenotype data were calculated using SPSS 21 software.Bar plots were created using OriginPro 2021 software.

Maker Development for Genotyping of the Secondary Population
Total genomic DNA was extracted from young leaves of F 2 individuals and the parental lines, L28 and RIL40, using a modified CTAB method [17].Three simple sequence repeat (SSR) markers, SWU2707, DPL0852, and DPL0757, reported in previous studies [16] were directly used to genotype the F 2 population (Table S1).De novo-designed molecular markers, SSR and Indel, were based on the TM-1 reference genome [18] within or adjacent to the physical intervals of the three target loci (Table 1).Each primer pair was designed in a ±200 bp sequence interval.All distinctive and unambiguous polymorphic markers between L28 and RIL40 were used to genotype the F 2 populations.The marker loci were named following their primer names.The genotyping results of the F 2:3 population were deduced from the results of the F 2 .

QTL Mapping
The genetic linkage maps of the three target regions were constructed using JoinMap 4.0 software [19].The conversion of the recombination frequencies to map distances (cM) used the Kosambi function [20].The QTLs were identified by composite interval mapping (CIM) using WinQTL Cartographer 2.5 software [21].Linkage groups and QTL distribution on the map were visualized using MapChart 2.2 software [22].

Candidate Gene Screening Based on DEG Analysis between L28 and RIL40
Total RNA was extracted from developing fiber samples of L28 and RIL40 at 10, 20, and 30 DPA, using TRIzol reagent (Tiangen, Beijing, China).Three biological replicates were performed for each sample.The raw data of the Illumina NovaSeq6000 sequencing platform had adaptor trimming, low-quality, and short reads processing with Fastp (v.0.20.0)[23] to obtain clean data.Quality control of the clean data was performed using Fastqc (v.0.11.5) [24].RNA-seq data analysis was performed using BMKCloud (www.biocloud.net,accessed on 24 May 2023).To obtain the location information of clean reads on the reference genome, the clean reads were aligned to and compared with the G. hirsutum (TM-1_V2.1)reference genome [18] using Hisat2 (v2.0.4) [25] and SAMtools [26].Then, the mapped reads were reassembled into a transcriptome by StringTie (v2.2.1) [27] based on the G. hirsutum (TM-1_V2.1)reference genome [18].The fragments per kilobase of transcript per million fragments of mapped reads (FPKM) values [28] of all genes were normalized using StringTie (v2.2.1) [27] software, which was used to evaluate the gene expression levels in the fiber developmental stages.The differentially expressed genes (DEGs) between L28 and RIL40 in the same development stage were analyzed using the DESeq2 (v1.30.1) [29] of the R package based on the criteria of |Fold Change| ≥ 2.0 and false discovery rate (FDR) < 0.01.

qRT-PCR Experiment
The expression of the candidate genes in the parents of the secondary population was again verified based on the RNA of fibers in different development periods (5,10,15,20,25, and 30 PDA) of the parents through a qRT-PCR experiment to validate the potential function of candidate genes in different fiber developments.The total RNA at each period of fiber development was isolated by the above and was converted to cDNA using the HiScript III RT superMix for qPCR (+gDNA wiper) reverse transcription (R323-01AA, Vazyme, Biotech, Nanjing, China).The qRT-PCR analyses were performed on an Applied Biosystems 7500 fast real-time PCR system (ABI) utilizing the chamQ universal SYBR qPCR master mix (Q711-02-AA, Vazyme, Biotech, Nanjing, China).The relative expression level of genes was calculated with the 2 −∆∆CT method [32].All primer sequences for the qRT-PCR analysis are listed in Table S1.The actin gene was used as a reference gene.

MAS Strategy in Breeding Practice
The screening of functional markers is the key step in MAS-based breeding projects.However, QTLs in previous studies were usually identified via the following key steps, i.e., (a) constructing a linkage segregation population, temporary or permanent; (b) genotyping the individuals of that population using an appropriate set of markers or marker collections; (c) phenotyping the individuals of the population in a certain environment; and (d) using the proper software to calculate the correlation between the genotype and phenotype.If an allele of a locus is significantly correlated to the expression of a trait, then it is thought that a QTL has been identified at this locus.Therefore, the QTLs are usually genetic background dependent and environment tagged.Obtaining effective QTLs that are not constrained by the genetic background and environment and using their markers in future practical breeding practices via MAS still remains of particular interest to researchers.Various studies have tackled this issue via several strategies [33,34].In a previous study, QTLs of FL, FS, FM, and LP were identified through linkage analysis [16].To validate the effectiveness of these QTLs, the current study selected three loci consisting of these QTLs or QTL clusters on chromosomes A01, A07, and D12 and used secondary segregation populations, including F 2 and its derived F 2:3 , to validate them.The result positively confirmed the effectiveness of QTL selection during early generations after hybridization.However, we also noticed that, compared to the previous results [16], the phenotypic variation rates of the target traits explained by the loci of A01 and D12 in the F 2 or F 2:3 generations were relatively low (Table 4).The dominant effect is a common phenomenon in QTL identification in temporary populations and it is not always in the same direction as the additive effect [35][36][37].The reason might be the opposite direction of the dominant effects of the same locus against additive effects.As the ultimate goal in a breeding project is to pyramid the loci of additive effects, except for the utilization of heterosis, the presence of such dominant effects does not affect the effectiveness of MAS in early generations.Moreover, the physical intervals of the three loci in this study were consistent with those of previous reports [14,33,34,[38][39][40][41][42].The results of this study indicate that the target traits, FL, FS, FM, and LP, are interrelated, as the QTLs of these traits are co-located in specific genomic regions.These loci could be effectively used for further research on the formation fiber quality and yield traits, as well as for future breeding projects via MAS, which necessitates an integrative consideration of the QTL compositions in the clusters and the distribution of QTL loci of each trait, especially the ones for the comprehensive improvement of multiple traits.

Function Validations of Candidate Genes
In this study, of the three QTL cluster locations, both the DEG analysis and qRT-PCR results demonstrated that GH_A01G0633 and GH_D12G0031 were candidate gene for QTL clusters in chromosomes A01 and D12, respectively.CBSX5 is a member of the CBS domain-containing protein family (PF00571), which might be involved in cell wall synthesis.Studies showed that the promoter region of the CBSX gene family members contained numerous stresses and phytohormone-responsive elements, indicating their involvement in regulating various stress responses and plant growth development via the plant hormone pathway in plants [43][44][45][46][47].In Arabidopsis, it was demonstrated that CBSX regulates H 2 O 2 levels and lignin polymerization [45], as well as secondary cell wall thickening of the endothecium during anther dehiscence [46], reactive oxygen species (ROS), and lignin deposition [44].Previous studies revealed that, in cotton, genes specifically expressed under stress were also specifically expressed during fiber development.In cotton, a study indicated that GhCBS genes were regulated under abiotic stress and hormonal treatments and in ovule and fiber development [48], indicating their significant impact on fiber development.HT1 (GH_D12G0031) is a member of the protein kinase superfamily (PF07714), which is involved in protein serine/threonine kinase activity and protein kinase activity.An early study identified that HT1 was important for the regulation of stomatal movements in response to CO 2 [49].Later research revealed that it also regulated red light-induced stomatal opening [50], which is a strong indicator of the water-use efficiency of a plant [51].Our results suggested that HT1 also played an important role in cotton fiber development; however, its mechanism of action still needs further clarification.
GH_A07G2222 in the interval of the QTL cluster on chromosome 7 was annotated possibly as BRASSINOSTEROID-INSENSITIVE1 (BRI1)-associated kinase 1 (BAK1)-interacting receptor-like kinase 1 (BIR1), a small leucine-rich repeat receptor-like kinase.In Arabidopsis, BIR1 was demonstrated to negatively regulate cell death pathways in plant BAK1-mediated pathogen-triggered immunity (PTI) signaling [52,53].The functioning of BIR1 is partially dependent on the salicylate (SA)-dependent resistance (R) protein pathway [54].In higher plants, BAK1 participates in multiple developmental processes through the brassinosteroid (BR) signaling pathway [55,56], including degrading H 2 O 2 via activation of CAT activity and the development of phloem vascular tissues [57].As a receptor, BAK1 forms ligandinduced complexes with different LRR-RLKs and flagellin-sensitive 2 in the regulation of vascular development and immune responses via the hormone pathway in plants [57,58], including ABA signaling [59].GH_A07G2247 was annotated to encode a member of the glycosyl hydrolase family 17 proteins (GHL17), which hydrolyzes 1,3-b-glucan polysaccharides found in the cell wall matrix [60].GHL17 is demonstrated to be involved in physiologically important processes in plants, including responses to biotic and abiotic stresses [61][62][63], defense against herbivores, and activation of phytohormones, lignification, and cell wall remodeling [64,65].The regulatory mechanisms of these candidate genes in fiber development still need validation.However, previous studies have shown that genes specifically expressed during fiber development are also specifically expressed under stresses [66,67], inferring that cotton fiber development and the stress responses of cotton plants may involve the same metabolic pathways.
In addition, in this QTL cluster interval, another two genes, GH_A07G2203 and GH_A07G2209, were also identified to have specific expression in fiber development.GH_A07G2203 was annotated as CONSTANS-like 9 (COL9), which is a transcription factor involved in regulating plant flowering and responses to abiotic stress [68,69].In rice, OsCOL9 interacted with OsRACK1 and enhanced rice blast resistance through salicylic acid (SA) and ethylene (ET) signaling pathways [68]; it also regulated the grain number formation of the main panicle in rice plants [69].In cotton, a study revealed that COL9 was involved in affecting flowering and the response to drought and salt stresses [70].GH_A07G2209 was identified as RAB GTPase homolog B1C (RABBIC), which is part of the Ras superfamily of small GTPases.RAB GTPases regulate vesicle formation, actin-and tubulin-dependent vesicle movement, and membrane fusion [71].The RAB GTPase family shares major trafficking elements related to the cell wall modification in ripe fruit, involving the trafficking of cell wall polymers and enzymes between cellular compartments in plants [72,73].These two candidate genes may also have a possible role in fiber development.In summary, these genes are likely potential candidate genes regulating cotton fiber development, which will be validated in future work.

Conclusions
In this study, based on the primary QTL identification results via a RIL population developed from L28 × X24, three important QTL or QTL cluster intervals on chromosomes A01, A07, and D12 were selected to conduct further dissection.An F 2 and its derived F 2:3 populations were developed from a cross of L28 × RIL40.Genetic linkage maps of these three intervals were constructed using molecule markers (SSR and Indel) using the F 2 secondary segregating population.The QTLs of fiber quality traits, FL, FS, and FM, and the yield trait LP on these three corresponding intervals were verified.Five DEGs were identified to have important roles in fiber development as potential candidate genes via RNA-seq strategy, TM-1 gene expression data analysis, and qRT-PCT verification.This study provides an experimental foundation for further exploration of these functional genes to dissect the genetic mechanism of cotton fiber development.
Author Contributions: R.L.: Conceptualization, Investigation, Data Analysis, Software, Manuscript Drafting.M.Z.: Field data collection and analysis and result discussion.Y.S.: Field data collection and analysis and result discussion.J.L.: Investigation, Data Curation, Formal Analysis.J.G.: Field Investigation, Data Collection, Formal Analysis.X.X.: Investigation, Methodology, Visualization.Q.C.: Supervision, Methodology, Formal Analysis.Y.Y.: Resources, Supervision, Funding Acquisition.W.G.: Conceptualization, Supervision, Manuscript Editing and Finalizing.All authors have read and agreed to the published version of the manuscript.

Funding:
The authors are grateful for the financial support provided by grants from the National Natural Science Foundation of China (32070560, 32070563), the Agricultural and Rural Intelligence Assistance Team for Xinjiang-High-quality Mechanical Harvesting Cotton Industrialization Team Project, the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01B114), and High Quality Cotton New Variety Zhongmiansuo 703 Efficient Technology Integration Demonstration Project of Kashgar Regional Science and Technology Plan (KS2023003).

Figure 1 .
Figure 1.The phenotypic distribution of fiber quality traits and lint percentage of F2 and F2:3 populations.Dotted diamond bar presents phenotype distribution of F2 population; Trellis presents phenotype distribution of F2:3 population.The curve on the graph represents the fitted normal distribution of the population.The red and blue arrows indicate the positions of L28 and RIL40 in the distribution, respectively.FL, fiber length; FS, fiber strength; FM, fiber mocronaire; LP, lint percentage.

Figure 1 .
Figure 1.The phenotypic distribution of fiber quality traits and lint percentage of F 2 and F 2:3 populations.Dotted diamond bar presents phenotype distribution of F 2 population; Trellis presents phenotype distribution of F 2:3 population.The curve on the graph represents the fitted normal distribution of the population.The red and blue arrows indicate the positions of L28 and RIL40 in the distribution, respectively.FL, fiber length; FS, fiber strength; FM, fiber mocronaire; LP, lint percentage.

Figure 2 .
Figure 2. QTLs of fiber quality traits and LP on genetic linkage maps.(a) QTLs of fiber quality traits (FL, FS, and FM) and LP on A01, A07, and D12 in primary linkage analysis of RIL population; (b) and (c) QTLs of fiber quality traits (FL, FS, and FM) and LP on the physical maps, and on the linkage maps of secondary segregating population; (d) differentially expressed genes (DEGs) in developing fibers between L28 and RIL40 in the target QTL intervals.

Figure 2 .
Figure 2. QTLs of fiber quality traits and LP on genetic linkage maps.(a) QTLs of fiber quality traits (FL, FS, and FM) and LP on A01, A07, and D12 in primary linkage analysis of RIL population; (b,c) QTLs of fiber quality traits (FL, FS, and FM) and LP on the physical maps, and on the linkage maps of secondary segregating population; (d) differentially expressed genes (DEGs) in developing fibers between L28 and RIL40 in the target QTL intervals.

Figure 3 .
Figure 3. Expression clustering of the dynamically expressed genes in the QTL intervals on chromosomes A01 (6.70-10.15Mb), A07 (85.48-93.43Mb), and D12 (0.40-1.43 Mb) in TM-1 gene expression database.Each cluster presents a similar gene expression profiling.The red zigzag line in the figure presents the fitted expression trend of each gene cluster.The yellow lines represent the gene expression profiles are more proximal to the fitted expression trend, while the green lines less proximal to the fitted expression trend.

Figure 3 .
Figure 3. Expression clustering of the dynamically expressed genes in the QTL intervals on chromosomes A01 (6.70-10.15Mb), A07 (85.48-93.43Mb), and D12 (0.40-1.43 Mb) in TM-1 gene expression database.Each cluster presents a similar gene expression profiling.The red zigzag line in the figure presents the fitted expression trend of each gene cluster.The yellow lines represent the gene expression profiles are more proximal to the fitted expression trend, while the green lines less proximal to the fitted expression trend.

Plants 2023 , 15 Figure 3 .
Figure 3. Expression clustering of the dynamically expressed genes in the QTL intervals on chromosomes A01 (6.70-10.15Mb), A07 (85.48-93.43Mb), and D12 (0.40-1.43 Mb) in TM-1 gene expression database.Each cluster presents a similar gene expression profiling.The red zigzag line in the figure presents the fitted expression trend of each gene cluster.The yellow lines represent the gene expression profiles are more proximal to the fitted expression trend, while the green lines less proximal to the fitted expression trend.

Figure 4 .
Figure 4. qRT-PCR analysis of candidate genes during fiber development of L28 and RIL40.*, ** and *** indicate the difference between L28 and RIL40 reaching a significant level at p < 0.05, p < 0.01 and p < 0.001 in t-test, respectively.

Figure 4 .
Figure 4. qRT-PCR analysis of candidate genes during fiber development of L28 and RIL40.*, ** and *** indicate the difference between L28 and RIL40 reaching a significant level at p < 0.05, p < 0.01 and p < 0.001 in t-test, respectively.

Table 2 .
Fiber quality traits and LP of parental lines and secondary segregating populations of F 2 and F 2:3 .and ** indicate that the difference between L28 and RIL40 reaches significance levels of p < 0.05 and p < 0.01, respectively, in t-test.|AVDP| represents absolute values of difference between L28 and RIL40. *

Table 3 .
Correlation analysis of fiber quality traits and LP of secondary segregating populations of F2 and F2:3.

Table 3 .
Correlation analysis of fiber quality traits and LP of secondary segregating populations of F 2 and F 2:3 .
** indicate the correlation significances between different traits at the 0.01 levels.

Table 4 .
QTL verification results of fiber quality traits and LP on A01, A07, and D12 in secondary segregating populations of F 2 and F 2:3 .

Table 5 .
Seven DEGs/candidate genes of between L28 and RIL40 identified via RNA-seq at different fiber development stages.